Counterfactual Vision-and-Language Navigation: Unravelling the Unseen
The task of vision-and-language navigation (VLN) requires an agent to follow text instructions to find its way through simulated household environments. A prominent challenge is to train an agent capable of generalising to new environments at test time, rather than one that simply memorises trajectories and visual details observed during training. We propose a new learning strategy that learns from both observed environments and generated counterfactual environments. We describe an effective algorithm to generate counterfactual observations for VLN on the fly, as linear combinations of existing environments. Simultaneously, our novel training objective encourages the agent's actions to remain stable between original and counterfactual environments, effectively removing the spurious features that otherwise bias the agent. Our experiments show that this technique provides significant improvements in generalisation on benchmarks for Room-to-Room navigation and Embodied Question Answering.
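To make the "linear combinations of existing environments" idea concrete, here is a minimal sketch in PyTorch. It assumes precomputed panoramic visual features; the names `make_counterfactual` and `alpha` are illustrative, not taken from the paper.

```python
import torch

def make_counterfactual(original_feats: torch.Tensor,
                        partner_feats: torch.Tensor,
                        alpha: torch.Tensor) -> torch.Tensor:
    """Mix visual features from an original and a similar environment.

    original_feats, partner_feats: (T, D) visual features for T views.
    alpha: (T, 1) mixing weights in [0, 1] (the exogenous variables);
        alpha = 0 recovers the original observation exactly.
    """
    return (1.0 - alpha) * original_feats + alpha * partner_feats

# Example: mix 10% of a similar environment into every view.
feats_a = torch.randn(36, 2048)   # e.g. 36 panoramic views, 2048-d features
feats_b = torch.randn(36, 2048)
cf = make_counterfactual(feats_a, feats_b, torch.full((36, 1), 0.1))
```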
Review for NeurIPS paper: Counterfactual Vision-and-Language Navigation: Unravelling the Unseen
Summary and Contributions: This paper introduces a method for generating *counterfactual* visual features to augment the training of vision-and-language navigation (VLN) models, which predict a sequence of actions to carry out a natural language instruction while conditioning on a sequence of visual inputs. Counterfactual training examples are produced by perturbing the visual features of an original training example with a linear combination of visual features from a similar training example. The weights of the linear combination (the exogenous variables) are optimized to jointly minimize the edit to the original features and maximize the probability that a separate speaker (instruction generation) model assigns to the true instruction conditioned on the resulting counterfactual features, subject to the constraint that the counterfactual features change the interpretation model's predicted action at every timestep. Once these counterfactual features are produced, the model is trained to assign equal probability to the actions of the original example when conditioning on the original and the counterfactual features (in imitation learning), or to obtain equal reward (in reinforcement learning). The method improves performance on unseen environments on the R2R benchmark for VLN, and also shows improvements on embodied question answering.
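As a rough illustration of the two objectives the summary describes (optimizing the mixing weights, then enforcing action consistency), the following hedged PyTorch sketch may help. `speaker_log_prob` and `policy` are assumed interfaces rather than the authors' actual code, the hard constraint that predictions change is omitted for brevity, and the squared-difference penalty is just one way to encode "equal probability".

```python
import torch

def mixing_weight_loss(alpha, original_feats, partner_feats,
                       speaker_log_prob, instruction, lam=1.0):
    """Loss minimized over the mixing weights `alpha`: keep the edit to
    the original features small while keeping the true instruction
    likely under a separate speaker model."""
    cf_feats = (1.0 - alpha) * original_feats + alpha * partner_feats
    edit_cost = (cf_feats - original_feats).pow(2).mean()
    speaker_nll = -speaker_log_prob(cf_feats, instruction)
    return edit_cost + lam * speaker_nll

def consistency_loss(policy, instruction, original_feats, cf_feats, actions):
    """Encourage equal probability for the ground-truth actions under the
    original and counterfactual features (the imitation-learning case)."""
    logp_orig = policy(instruction, original_feats).log_softmax(dim=-1)
    logp_cf = policy(instruction, cf_feats).log_softmax(dim=-1)
    # Gather the log-probability of each ground-truth action and penalize
    # the squared difference between the two conditions.
    lp_o = logp_orig.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
    lp_c = logp_cf.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
    return (lp_o - lp_c).pow(2).mean()
```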